C3: A Parallel Model for Coarse-Grained Machines
Abstract
In this paper, we propose a model for parallel computation, the C3-model. The C3-model evaluates, for a given parallel algorithm and target architecture, the complexity of computation, the pattern of communication, and the potential congestion arising during communication. A metric for estimating the effect of link and processor congestion on the performance of a communication operation is developed. This metric allows the evaluation of arbitrary communication operations without the user having to specify fine scheduling details. We describe how the C3-model can serve as a platform for the development of coarse-grained algorithms sensitive to the parameters of a parallel machine. The initial validation of the C3-model is discussed for the Intel Touchstone Delta. We compare predicted and actual performance of different solutions for communication operations and of various divide-and-conquer approaches for contour ranking on images.
Related resources
C3: an architecture-independent model for coarse-grained parallel machines
We propose an architecture-independent parallel model, the C3-model. The C3-model evaluates, for a given parallel algorithm and target architecture, the complexity of computation, the pattern of communication, and the potential congestion arising in communication operations. A metric for estimating the effect of link and processor congestion on the performance of an arbitrary communication op...
Coarse grained parallel algorithms for graph matching
Parallel graph algorithm design is a very well studied topic. Many results have been presented for the PRAM model. However, these algorithms are inherently fine grained, and experiments show that PRAM algorithms often do not achieve the expected speedup on real machines because of large message overheads. In this paper, we present coarse grained parallel graph algorithms with small message overh...
PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines
PACK/UNPACK are Fortran 90/HPF array construction functions which derive new arrays from existing arrays. We present algorithms for performing these operations on coarse-grained parallel machines. Our algorithms are relatively architecture independent and can be applied to arrays of arbitrary dimensions with arbitrary distribution along every dimension. Experimental results are presented on the
Communication-Efficient Deterministic Parallel Algorithms for Planar Point Location and 2d Voronoi Diagram
In this paper we describe deterministic parallel algorithms for planar point location and for building the Voronoi Diagram of n co-planar points. These algorithms are designed for BSP-like models of computation, where p processors, with O(n/p) ≫ O(1) local memory each, communicate through some arbitrary interconnection network. They are communication-efficient since they require, respectively, O...
Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines
Vector prefix and reduction are collective communication primitives in which all processors must cooperate. We present two parallel algorithms, the direct algorithm and the split algorithm, for vector prefix and reduction computation on coarse-grained, distributed-memory parallel machines. Our algorithms are relatively architecture independent and can be used effectively in many applications su...
Journal title:
- J. Parallel Distrib. Comput.
Volume 32, Issue
Pages -
Publication date 1996